Francis Williams, Bangor University,
f.williams@bangor.ac.uk PRIMARY
William Faithfull, Bangor University,
w.faithfull@bangor.ac.uk
Jonathan Roberts, Bangor University,
j.c.roberts@bangor.ac.uk
Student Team: YES
We developed the SitaVIS tool for the 2012 challenge in C# using Microsoft's XNA framework in Microsoft Visual Studio 2010 Professional, alongside MySQL Community Server.
Using a component-based architecture, we developed specific linked views for the visualization, including a vector image of the map showing regions, histograms, stream graphs, dense-pixel plots of individual machines, and an interface. A hierarchical drill-down interface was employed so that different times and hypotheses can be explored. Fast redrawing of all the nodes was required; we achieved this with Microsoft’s XNA framework, which provides a real-time environment.
Video:
Answers to Mini-Challenge 1 Questions:
MC 1.1 Create a visualization of the health and policy status of the entire Bank of Money enterprise as of 2 pm BMT (BankWorld Mean Time) on February 2. What areas of concern do you observe?
Figure 1 shows the status of the Bank of Money enterprise at 2 pm.
Figure 1. All facilities and regions at 14:00 on 02/02/2012. The worst status of all nodes of a facility is shown by the color. This demonstrates that every facility has at least one machine at policy status 2 and 3, with some at 4 and one at 5.
The first concern is the lack of data in Region 25 (Figure 1, green triangle, right). 16 facilities in this region have no data at this time, and a further 19 facilities, including the region headquarters, exhibit the same behaviour over the following 7 hours. We suspect that this is an unexpected system failure or power outage. An attack would produce a large spike in activity flag 2 (a machine going down for maintenance), but Figure 2 (left graph) shows no such spike, which leads us to this conclusion.
The second concern is the Datacenter-2 icon, which is highlighted in red (Figure 1). Red indicates the presence of at least one machine with a policy status of level 5 (possible virus infection / questionable files). Figure 3 shows a dense-pixel plot of all machines at Datacenter-2, colored according to their policy statuses and sorted by IP address. It shows that the datacenter has 4 servers at policy status 4 (critical policy deviations / patches failing) and 1 server at policy status 5. Datacenter-2 is the only facility highlighted in red, meaning that this is the only machine at policy status 5 in the entire network.
Figure 2. For Region 25, the left graph shows Activity Flag 2 and the right graph shows the power outage.
Figure 3. Dense-pixel plot of Datacenter-2 policy statuses sorted by IP address.
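The ordering behind a dense-pixel plot such as Figure 3 can be sketched as follows. This is only an illustrative sketch in Python, not the SitaVIS implementation (which is written in C#). The function name `dense_pixel_grid`, the row width, and the sample addresses are all hypothetical.

```python
import ipaddress

def dense_pixel_grid(machines, width):
    """Lay out (ip, policy_status) pairs row by row, sorted numerically by IP.

    Returns a list of rows; each cell holds a policy status, or None
    where the last row is padded.
    """
    # Sort by the numeric value of the address, not by the string,
    # so that 172.1.1.3 comes before 172.1.1.10.
    ordered = sorted(machines, key=lambda m: ipaddress.ip_address(m[0]))
    statuses = [status for _, status in ordered]
    # Pad the final row so that every row has the same width.
    statuses += [None] * (-len(statuses) % width)
    return [statuses[i:i + width] for i in range(0, len(statuses), width)]

# Hypothetical sample data (not taken from the challenge dataset).
sample = [
    ("172.1.1.10", 1),
    ("172.1.1.2", 5),
    ("172.1.1.3", 4),
    ("172.1.2.1", 2),
    ("172.1.1.1", 1),
]
grid = dense_pixel_grid(sample, width=2)
```

Each cell of `grid` would then be drawn as one pixel (or small square) colored by its policy status.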
MC 1.2 Use your visualization tools to look at how the network’s status changes over time. Highlight up to five potential anomalies in the network and provide a visualization of each. When did each anomaly begin and end? What might be an explanation of each anomaly?
We developed SitaVIS specifically to analyse this data; its parts are described in Figure 4. We first created a relational database of 7 core tables to reduce duplication and to index by integer values rather than strings. After some initial queries to explore the data, we used C# and Microsoft XNA to develop a visual-query interface. SitaVIS enables the analyst to filter the data and to display large datasets quickly. The code is optimized for real-time interaction with this dataset, as demonstrated by the video. The buttons set up the query to the database, and the user chooses the time-point (Tp) to display on the map. Further queries are run when the user rolls over a facility or zooms into a specific area. The graphs show a timeline of the whole data.
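The idea of indexing by integer values rather than strings can be illustrated with a small sketch. It uses Python's built-in SQLite module in place of MySQL, and the two tables, their columns, and the sample row are hypothetical; they are not the 7 core tables of the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table: each facility name is stored once.
    CREATE TABLE facility (
        facility_id INTEGER PRIMARY KEY,
        name        TEXT UNIQUE NOT NULL
    );
    -- Hypothetical fact table: refers to facilities by integer key only.
    CREATE TABLE status_report (
        facility_id   INTEGER NOT NULL REFERENCES facility(facility_id),
        ip            INTEGER NOT NULL,  -- IPv4 address packed into an integer
        time_point    INTEGER NOT NULL,  -- index of the time step (assumed)
        policy_status INTEGER NOT NULL
    );
    CREATE INDEX idx_report_time ON status_report (time_point, facility_id);
""")

def facility_id(name):
    """Return the integer key for a facility name, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO facility (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT facility_id FROM facility WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

# Hypothetical sample row.
dc2 = facility_id("datacenter-2")
conn.execute("INSERT INTO status_report VALUES (?, ?, ?, ?)", (dc2, 2886795521, 23, 5))

rows = conn.execute(
    "SELECT f.name, r.policy_status "
    "FROM status_report r JOIN facility f USING (facility_id) "
    "WHERE r.time_point = ?",
    (23,),
).fetchall()
```

Storing the name once and joining on the integer key keeps the large table compact and lets the time-point index work on integers only.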
The general investigative methodology was to provide an overview of the data (such as the map in Figure 1 in MC1.1) and linked views to drill down into specific data values. To generate the overview, we aggregate the different policy statuses to display only the worst policy status of a region (we have also explored different aggregation techniques). We can then drill down to display dense-pixel plots of the ports (Fig MC1.1 Image1) or change the aggregation of the filter. We have also explored different visual orderings of the dense-pixel plot to explore phenomena such as port scans and denial-of-service attacks.
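The worst-status aggregation used for the overview can be sketched in a few lines. This is an illustrative Python sketch rather than the SitaVIS code; the function name `worst_status` and the sample reports are hypothetical.

```python
def worst_status(reports):
    """Aggregate (group, policy_status) pairs to the worst status per group.

    A higher policy status is worse, so the aggregate is the maximum.
    """
    worst = {}
    for group, status in reports:
        worst[group] = max(status, worst.get(group, status))
    return worst

# Hypothetical sample data (not taken from the challenge dataset).
reports = [
    ("datacenter-2", 1),
    ("datacenter-2", 5),
    ("datacenter-2", 4),
    ("branch-1", 2),
    ("branch-1", 3),
]
overview = worst_status(reports)
```

Other aggregations (for example the mean or the most frequent status) could be substituted by replacing the call to `max`.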
Figure 4. Screen shot of SitaVIS.
Anomaly 1.
All servers of Datacenter-5 (DC5) are offline from 8:15 (the start).
By visually inspecting the visualization of connections per machine, we noticed that DC5 has a low number of connections. Color is mapped onto a value normalized at each time point. Figure 5 shows Datacenter-5 mapped to green while the other datacenters are purple, which demonstrates this anomaly. Zooming in and hovering over this datacenter provides a further drill-down, which shows no servers working and only 3 workstations. A small number of servers come online at 10:30 for 1 hour. Servers then come online in stages from 12:30, reaching full functionality at 19:15; during this period, one server reaches policy status 5 at 18:15, as shown in Figure 6.
Figure 5 demonstrates inactive servers at Datacenter-5. The graph in the bottom right shows this lack of connections.
Figure 6 shows a zoomed-in view of Datacenter-5 with the dense-pixel plot of machines at the facility, ordered by IP address and colored by policy status.
Anomaly 2.
At the start, all machines in Region 5 and Region 10 show policy status 2 (moderate policy deviations).
It is atypical for all machines to have identical policy statuses; we would expect some machines to be at other policy levels. We calculated the proportion of machines at each policy level at a facility and normalized this against the whole data for each policy level. This normalized value is then mapped to color (green for the lowest value, through blue, purple, and orange, to red for the highest value), as shown in Figure 7.
Figure 7 shows Regions 5 and 10 with all policy statuses at level 2 (moderate policy deviations). The color represents a normalized value (green for the lowest value, through blue, purple, and orange, to red for the highest value).
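The normalization and color mapping described for Anomaly 2 can be sketched as follows. This Python sketch makes several assumptions that the text does not confirm: min-max normalization across facilities, a discrete five-step ramp rather than interpolated colors, and at least one machine per facility. The function names and sample data are hypothetical.

```python
# Color ramp from the text: green (lowest) through to red (highest).
RAMP = ["green", "blue", "purple", "orange", "red"]

def proportion(statuses, level):
    """Fraction of machines at a facility that are at the given policy level."""
    return sum(1 for s in statuses if s == level) / len(statuses)

def normalized_colors(facilities, level):
    """Map {facility: [policy_status, ...]} to {facility: color name}.

    Assumes min-max normalization over all facilities for one policy level.
    """
    props = {name: proportion(s, level) for name, s in facilities.items()}
    lo, hi = min(props.values()), max(props.values())
    span = hi - lo
    colors = {}
    for name, p in props.items():
        t = (p - lo) / span if span else 0.0
        # Clamp so that t == 1.0 falls in the last (red) bin.
        colors[name] = RAMP[min(int(t * len(RAMP)), len(RAMP) - 1)]
    return colors

# Hypothetical sample data (not taken from the challenge dataset).
facilities = {
    "region-5-branch": [2, 2, 2, 2],
    "region-1-branch": [1, 1, 2, 3],
    "region-2-branch": [1, 1, 1, 1],
}
colors = normalized_colors(facilities, level=2)
```

With this sample, a facility whose machines are all at level 2 lands at the red end of the ramp, which is the kind of pattern Figure 7 highlights for Regions 5 and 10.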
Anomaly 3.
All datacenters have at least one machine at policy 5 at 19:00 on 02/02/2012.
We would not expect every datacenter to have a compromised machine at the same time. By choosing the policy status filter in SitaVIS, we were able to visualize the map of all datacenters and facilities, scroll the time point, and explore this occurrence; Figure 8 shows a screenshot of this.
Figure 8 shows the first time point at which all the datacenters include at least one server at policy status 5.
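In SitaVIS this time point was found interactively by scrolling through time. The same question can also be asked of the data directly. The following Python sketch is an illustration only; the function name, the tuple layout, and the sample reports (with just two datacenters) are hypothetical.

```python
def first_time_all_flagged(reports, facilities, level=5):
    """Return the earliest time point at which every listed facility has
    at least one machine at the given policy status, or None.

    reports: iterable of (time_point, facility, policy_status).
    """
    wanted = set(facilities)
    flagged = {}
    for time_point, facility, status in reports:
        if facility in wanted and status == level:
            flagged.setdefault(time_point, set()).add(facility)
    for time_point in sorted(flagged):
        if flagged[time_point] == wanted:
            return time_point
    return None

# Hypothetical sample data (not taken from the challenge dataset).
reports = [
    ("2012-02-02 18:45", "datacenter-1", 5),
    ("2012-02-02 18:45", "datacenter-2", 4),
    ("2012-02-02 19:00", "datacenter-1", 5),
    ("2012-02-02 19:00", "datacenter-2", 5),
    ("2012-02-02 19:00", "branch-1", 2),
]
first = first_time_all_flagged(reports, {"datacenter-1", "datacenter-2"})
```

The time points are written as ISO-style strings so that sorting them as text matches chronological order.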
Anomaly 4.
At 21:15, HQ has a single workstation with a policy status of 5.
By 22:00, HQ no longer has any workstations at level 5. Then at 00:00 a machine changes to status 5, but it has a different IP address (see Figure 9); this machine remains at level 5 until 03:00, when the worst policy status of HQ returns to 4. Finally, at 03:15 yet another IP address is flagged at policy level 5 (see Figure 10).
Figure 9 demonstrates that the HQ has one workstation at policy level 5, shown in red in the dense pixel plot, at 00:00.
Figure 10 demonstrates a dense-pixel plot of HQ at 03:15, in which a different IP address is flagged from the one at the earlier time-stamp in Figure 9.
Anomaly 5.
At the end of the time period (08:00 on 04/02/2012), fewer than 50% of machines in the entire network are registering a normal policy status, and every facility has machines at policy status 3, 4 or 5. The stacked timeline in Figure 11 demonstrates this proportion.
Figure 11 shows a stacked line graph accumulating each policy status over the whole network.
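The proportions behind a stacked timeline such as Figure 11 can be sketched as follows. This Python sketch is illustrative only; it assumes that policy status 1 denotes a normal policy, and the function name and sample data are hypothetical.

```python
from collections import Counter, defaultdict

def status_shares(reports, levels=(1, 2, 3, 4, 5)):
    """Compute, for each time point, the share of machines at each policy status.

    reports: iterable of (time_point, policy_status).
    Returns {time_point: [share for each level in `levels`]}.
    """
    counts = defaultdict(Counter)
    for time_point, status in reports:
        counts[time_point][status] += 1
    shares = {}
    for time_point in sorted(counts):
        total = sum(counts[time_point].values())
        shares[time_point] = [counts[time_point][lv] / total for lv in levels]
    return shares

# Hypothetical sample data: four machines reporting at two time points.
reports = [
    (0, 1), (0, 1), (0, 1), (0, 2),
    (1, 1), (1, 2), (1, 3), (1, 5),
]
shares = status_shares(reports)

# Assumption: status 1 is the normal policy status.
normal_share_at_end = shares[max(shares)][0]
```

Stacking the five shares for each time point gives one column of the timeline; a check such as `normal_share_at_end < 0.5` corresponds to the observation made for Anomaly 5.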